# Multimodal training
Mobileclip S2 Timm
MobileCLIP-S2 is an efficient image-text model trained with multi-modal reinforced training. It offers fast inference and strong zero-shot performance at a compact size.
Text-to-Image
M
apple
147
4
Mobileclip S0 Timm
MobileCLIP-S0 is an efficient image-text model trained with multi-modal reinforced training. It prioritizes speed and small size while maintaining high performance.
Text-to-Image
M
apple
532
10